European Bioinformatics Institute

The European Bioinformatics Institute (EBI) is a centre for research and services in bioinformatics, and is part of the European Molecular Biology Laboratory (EMBL). It is located on the Wellcome Trust Genome Campus in Hinxton, United Kingdom.

About the EMBL-EBI

The roots of the EMBL-EBI lie in the EMBL Nucleotide Sequence Data Library[1] (now known as EMBL-Bank), which was established in 1980 at the EMBL laboratories in Heidelberg, Germany and was the world's first nucleotide sequence database[2]. The original goal was to establish a central computer database of DNA sequences, rather than have scientists submit sequences to journals. What began as a modest task of abstracting information from literature soon became a major database activity with direct electronic submissions of data and the need for highly skilled informatics staff. The task grew in scale with the start of the genome projects, and grew in visibility as the data became relevant to research in the commercial sector. It soon became apparent that the EMBL Nucleotide Sequence Data Library needed better financial security to ensure its long-term viability and to cope with the sheer scale of the task.

There was also a need for research and development to provide services, to collaborate with global partners to support the project, and to provide assistance to industry. To this end, in 1992, the EMBL Council voted to establish the European Bioinformatics Institute and to locate it at the Wellcome Trust Genome Campus in the United Kingdom where it would be in close proximity to the major sequencing efforts at the Sanger Institute. From 1992 through to 1995, a gradual transition of the activities in Heidelberg took place, until in September 1995 the EMBL-EBI occupied its current location on the Wellcome Trust Genome Campus.

When the EMBL-EBI moved to Hinxton it hosted two databases, one for nucleotide sequences (the EMBL Data Library, now known as EMBL-Bank) and one for protein sequences (Swiss-Prot–TrEMBL, now known as UniProt). Since then, the EMBL-EBI has diversified to provide data resources in all the major molecular domains and expanded to include a broad research base. It provides user support and offers advanced training in bioinformatics.[3]

Funding

Because the EBI is part of EMBL, the largest share of its funding comes from the governments of EMBL's 20 member states. Other major funders include the European Commission, Wellcome Trust, US National Institutes of Health, UK Research Councils, EBI's industry partners and the UK Department of Trade and Industry. In addition, the Wellcome Trust provides the facilities for the EMBL-EBI on its Genome Campus at Hinxton, and the UK Research Councils have also provided funds for EBI's facilities in Hinxton.

Data resources and tools at the EBI

The EBI acts as a data centre providing several databases[4] and web services[5]:

ENA; Genomes; Gene Expression; Literature; Sequence Similarity & Analysis
UniProt; Nucleotide Sequences; Molecular Interactions; Taxonomy; Pattern and Motif Searches
ArrayExpress; Protein Sequences; Reactions and Pathways; Ontologies; Structure Analysis
Ensembl; Macromolecular Structures; Protein Families; Text Mining
InterPro; Small Molecules; Enzymes; Downloads
PDBe; SOAP & REST Web Services; Carbohydrate structures

Full Database and Services indices at the EBI

Groups at the EBI

The EBI hosts many different groups, working on research, providing services to the bioinformatics community or a mixture of both.

EBI Administration (Mark Green)
Bertone Group (Paul Bertone)
Genomic analysis of developmental pathways, with a focus on differentiation and lineage commitment in mammalian embryonic stem cells.
ChEMBL Group (John Overington)
The ChEMBL group's research focuses on mapping the interactions and functional effects of small molecules binding to their macromolecular targets.
Computational Systems Neurobiology Group (Nicolas Le Novère)
The interests of the Computational Systems Neurobiology group revolve around signal transduction in neurons, ranging from the molecular structure of membrane proteins involved in neurotransmission to modelling signalling pathways. A strong focus is the molecular and cellular basis of synaptic plasticity in neurons of the basal ganglia. The group also provides tools and resources for computational systems biology, including the Systems Biology Ontology (SBO), MIRIAM Resources, and software to develop models. A main project of the group is BioModels Database, which allows biologists to store, search and retrieve published mathematical models of biological interest.
Enright Group (Anton Enright)
This group will focus on a number of problems relating to the prediction of the functions of genes and proteins in living organisms.
External Services Group (Rodrigo Lopez)
Develops and maintains Web Services APIs for most tools available from EMBL-EBI, as well as the EB-eye (the EBI's search engine), the EBI SRS servers and 2can, for external as well as internal users. See also EBI External Services.
Ensembl Genomes Team (Paul Kersey)
The team's main focus is currently the development of Ensembl Genomes, which expands the use of the Ensembl system from its current focus on vertebrate genomes to cover important species from all domains of life, with the launch of new sites for Ensembl Metazoa, Ensembl Plants and Ensembl Bacteria in late 2008 and Ensembl Plants and Ensembl Fungi in the first half of 2009.
GO Editorial Office (Jane Lomax)
The GO Editorial Office at EBI coordinates the development and maintenance of the GO vocabularies, and contributes to several other GO project efforts, including documentation, web presence, software testing, and user support.
Graham Cameron
Associate Director of the EBI.
Goldman Group (Nick Goldman)
This group is developing methods for the analysis of DNA and amino acid sequences to study evolution.
Huber Group (Wolfgang Huber)
Focuses on gene transcription and protein–DNA binding analysis with DNA microarrays; statistical computing and high-throughput cellular assays and genetic interaction screens.
Industry Support (Dominic Clark)
The EBI supports industry through two programmes: the EBI Industry Programme is a well-established, subscription-based programme for large companies, whereas the SME Support Forum offers support to smaller companies that are not eligible to join the Industry Programme.
InterPro Group (Sarah Hunter)
Develops and maintains the InterPro project, an integrated documentation resource for protein families, domains and functional sites that is used for small and large-scale functional classification of proteins.
Literature Services (Peter Stoehr/Johanna McEntyre)
This group is in charge of the development and maintenance of CitExplore and related services.
Regulation Group (Nick Luscombe)
Focuses on the genomic analysis of regulatory systems.
Microarray Group (Alvis Brazma)
Uses microarray technology to analyse the sequence data from the genome projects to identify which genes are expressed in a particular cell type of an organism.
MicroArray Technical Team (Ugis Sarkans)
PDBe - Protein Data Bank in Europe (formerly MSD) (Gerard Kleywegt)
Serves a list of protein quaternary structures (or macromolecules) for every entry in the Protein Data Bank (PDB). It is also a member of the Worldwide Protein Data Bank, and is one of the three worldwide sites that accept, process and distribute macromolecular structure data. This group aims to improve the consistency and quality of the world archive of data on macromolecular structures by integrating current database and informatics technologies with a solid core of expertise in structural biology. This group also hosts the canonical version of the Eurocarb database.
Outreach and Training Team (Cath Brooksbank)
Coordinates both the communication of the EBI’s scientific mission and activities to the community and the EBI’s scientific training programme. The user-training programme equips users of the EBI’s bioinformatics services with the knowledge that they need to use its data resources.
PANDA (Protein And Nucleotide DAtabase) group, Ensembl (Rolf Apweiler & Ewan Birney)
Provides, for almost 35 species, a genome browser, public access to the MySQL databases of annotations shown in the browser, and a Perl API for accessing the database. The group is divided fairly equally between the EBI and the Wellcome Trust Sanger Institute with gene builders and web team on the Sanger side, and core database and comparative genomics teams at the EBI.
Proteomics Services Team (Henning Hermjakob)
Provides databases and tools for the deposition, distribution and analysis of proteomics and proteomics-related data.
Rebholz Group (Dietrich Rebholz-Schuhmann)
Focuses on the extraction of facts from scientific literature in molecular biology. The main methods are based on finite-state automata (FSAs). In the past, the group has worked on the identification of protein–protein interactions, acronyms and descriptions of mutations.
Rice Group (Peter Rice)
This group is investigating & advising on the e-Science & Grid technology requirements of the EMBL-EBI, through application development plus participation in standards development.
Chemoinformatics and Metabolism (Christoph Steinbeck)
The Steinbeck group's research in molecular informatics focuses on understanding the small-molecule metabolism of living organisms, including methods for computer-assisted structure elucidation of biological metabolites and simulations of metabolic pathways.
Systems Group (Petteri Jokinen)
Maintains and develops the state-of-the-art computing infrastructure on which most EBI operations are run.
Thornton Group (Janet Thornton)
Uses biomolecular structures to understand enzyme active sites, protein–protein interactions, protein–ligand interactions and protein–DNA interactions, and works on structure and modelling.
Vertebrate Genomics Group (Paul Flicek)
The Vertebrate Genomics Group, part of the PANDA Nucleotides Group, focuses on functional annotation of the genome, including methods for incorporating high-throughput epigenetic data, and on expanding and understanding the collection of human variation.
Database Research and Development Group (Weimin Zhu)
The Database Research and Development Group conducts research and development on database-related challenges. Biomolecular databases are becoming increasingly large, complex and interconnected. This increase in data scale and complexity, together with the need for interoperability, means that there are many fundamental challenges in database development, deployment and distribution. The group will lead the EBI's research into database technologies, looking both at solutions from other fields with similar datasets and at new, cutting-edge technologies from database research.[6]

Education, training and user support

The EBI provides many different education, training, user support and outreach events, including:

The Bioinformatics Roadshow
A travelling user-training programme that is tailored to the needs of users of Europe’s main data resources.
The EBI Hands-on User Training programme
A series of short courses, held in the EBI’s IT training suite, that aims to familiarise experimental researchers with the EBI’s core data resources.
2can Bioinformatics User Support Portal
Provides short and concise introductions to basic concepts in molecular and cell biology and bioinformatics. It focuses on making it easy for the user to understand which tools and databases are available from the EBI and collaborating sites. It also provides links to other sites where similar resources are maintained and well supported.

EBI hosted projects

Several research projects are hosted at the EBI.

References

  1. Stoesser, G.; Sterk, P.; Tuli, M.; Stoehr, P.; Cameron, G. (1997). "The EMBL Nucleotide Sequence Database". Nucleic Acids Research 25 (1): 7–14. doi:10.1093/nar/25.1.7. PMC 146376. PMID 9016493. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=146376.
  2. Kneale, G.; Kennard, O. (1984). "The EMBL nucleotide sequence data library". Biochemical Society Transactions 12 (6): 1011–1014. PMID 6530028.
  3. Wright, V. A.; Vaughan, B. W.; Laurent, T.; Lopez, R.; Brooksbank, C.; Schneider, M. V. (2010). "Bioinformatics training: Selecting an appropriate learning content management system--an example from the European Bioinformatics Institute". Briefings in Bioinformatics 11 (6): 552–562. doi:10.1093/bib/bbq023. PMID 20601435.
  4. Brooksbank, C.; Cameron, G.; Thornton, J. (2009). "The European Bioinformatics Institute's data resources". Nucleic Acids Research 38 (Database issue): D17–D25. doi:10.1093/nar/gkp986. PMC 2808956. PMID 19934258. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2808956.
  5. McWilliam, H.; Valentin, F.; Goujon, M.; Li, W.; Narayanasamy, M.; Martin, J.; Miyar, T.; Lopez, R. (2009). "Web services at the European Bioinformatics Institute-2009". Nucleic Acids Research 37 (Web Server issue): W6–W10. doi:10.1093/nar/gkp302. PMC 2703973. PMID 19435877. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2703973.
  6. EMBL-EBI web site